Abstract. We study the $k$-largest eigenvalues of heavy-tailed sample covariance matrices of the
form $XX^T$ in an asymptotic framework, where the dimension of the data and the sample size tend
to infinity. To this end, we assume that the rows of $X$ are given by independent copies of some
stationary process with regularly varying marginals with index $\alpha \in (0,2)$ satisfying large
deviation and mixing conditions. We apply these general results to stochastic volatility and GARCH
processes.



1. Introduction 

In the statistical analysis of high-dimensional data one often tries to reduce its dimensionality 
while preserving as much of the variation in the data as possible. One important example of such 
an approach is the Principal Component Analysis (PCA). PCA makes a linear transformation of 
the data to a new set of variables, the principal components, which are ordered such that the first 
few retain most of the variation. Therefore one obtains a lower dimensional representation of the 
data by retaining only the first few principal components. 

The variances of the first $k$ principal components are given by the $k$-largest eigenvalues of
the covariance matrix. Let us collect the samples of our multivariate data in a $p \times n$ matrix
$X$, where we refer to $p$ as the dimension of the data and to $n$ as the sample size. In practice,
the true underlying covariance matrix is not available, thus one usually replaces it with the
sample covariance matrix $\frac{1}{n} XX^T$. For more details on PCA we refer the reader to one of
the many textbooks available on this topic, see [2] or [13], for example.

To account for large high-dimensional data sets, we study the $k$-largest eigenvalues of the
sample covariance matrix when both the dimension of the data $p$ and the sample size $n$
go to infinity. The field of research that investigates the spectral properties of large dimensional
random matrices has become known as Random Matrix Theory (RMT). There exist several survey
articles which stress the close relationship between Random Matrix Theory and multivariate
statistics, including PCA, see e.g. [10] and [12]. Some authors have already employed tools from
Random Matrix Theory to correct traditional tests or estimators which fail when the dimension
of the data cannot be assumed to be negligible compared to the sample size. For example, Bai
et al. [3] give corrections to some likelihood ratio tests that fail even for moderate dimension
(around 20), and El Karoui [11] consistently estimates the spectrum of a large dimensional
covariance matrix using Random Matrix Theory.

Davis, Pfaffel and Stelzer [9] study the $k$-largest eigenvalues of a sample covariance matrix
based on observations that come from a high-dimensional linear process with heavy-tailed
marginals. It is often necessary to use non-linear instead of linear models to capture the complex
dependence structure of the data. This is particularly true in finance, where the log-returns
exhibit both non-linearity and heavy tails. The objective of this paper is to extend some of the
results of [9] to a non-linear setting.

In Section 2 we study non-linear processes with regularly varying tail probabilities with index
less than two which satisfy certain large deviation and mixing conditions. We then apply our
results to two models that are heavily employed in finance: stochastic volatility models in
Section 3 and GARCH(p,q) processes in Section 4. More background on these processes and other
financial time series models may be found in [1], for example. We assume throughout Sections 2-4
that the dimension $p \approx n^{\beta}$ with $\beta > 0$ satisfying
$\beta < \frac{2-\alpha}{\alpha-1}$ if $1 < \alpha < 2$. This restriction is rather general;
however, if $\alpha$ is close to 2, it becomes quite strict. Therefore we will present a result for
the largest eigenvalue of $XX^T$ in Section 5 which holds for $p > n$ independently of the value of
$\alpha$ as long as $0 < \alpha < 2$.

This article makes heavy use of the theory of regular variation and point processes. A good 
introduction may be found in Resnick [15], for example. 



2. Eigenvalues of heavy-tailed sample covariance matrices of stationary processes

Throughout this section we assume that $(X_t)$ is a strictly stationary sequence of random
variables with marginals that are regularly varying with tail index smaller than two. In other
words, there exist a normalizing sequence $(a_n)$ and an $\alpha \in (0,2)$ such that, for any
$x > 0$,

$$ n P(|X_0| > a_n x) \xrightarrow[n\to\infty]{} x^{-\alpha}. \tag{2.1} $$

Moreover we assume that $X_0$ satisfies the tail balancing condition, i.e., that the limit

$$ \lim_{x\to\infty} \frac{P(X_0 > x)}{P(|X_0| > x)} \quad \text{exists.} $$

Then we construct our $p \times n$ observation matrix $X = (X_{it})_{i,t}$ as follows: for each
$1 \le i \le p$, let $(X_{it})_{1 \le t \le n}$ be an independent copy of $(X_t)_{1 \le t \le n}$.
This means that $(X_{it})_t$ and $(X_t)_t$ have the same distribution, and all rows of $X$ are
independent. We denote by $\lambda_1, \ldots, \lambda_p \ge 0$ the eigenvalues of $XX^T$ and study
them via their induced point process

$$ \sum_{i=1}^{p} \varepsilon_{a_{np}^{-2}\lambda_i}(B) = \left|\{1 \le i \le p : a_{np}^{-2}\lambda_i \in B\}\right|, \quad B \subset (0,\infty). $$

Our motivation to study this problem comes from the statistical analysis of high-dimensional
data. Therefore we assume in the following that $p = p_n$ is an integer-valued sequence such that
$p_n \to \infty$ as $n \to \infty$.

The upcoming theorem is a combination of results from [9] where mostly linear processes are 
studied. This general theorem will be the basis of all our further results. 



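Theorem 2.1. Assume that there exists a $b \ge 0$ such that

$$ p\, P\left( \sum_{t=1}^{n} X_t^2 > a_{np}^2 x \right) \xrightarrow[n\to\infty]{} b\, x^{-\alpha/2} \quad \text{for each } x > 0. \tag{2.2} $$

Suppose that $p = p_n \to \infty$ and $n \to \infty$ such that

$$ \limsup_{n\to\infty} \frac{p_n}{n^{\beta}} < \infty \tag{2.3} $$

for some $\beta > 0$, satisfying $\beta < \frac{2-\alpha}{\alpha-1}$ if $1 < \alpha < 2$. Then we
have that

$$ \sum_{i=1}^{p} \varepsilon_{a_{np}^{-2}\lambda_i} \xrightarrow{D} \sum_{i=1}^{\infty} \varepsilon_{b^{2/\alpha}\Gamma_i^{-2/\alpha}} \tag{2.4} $$

as $n \to \infty$, where $\Gamma_i = E_1 + \ldots + E_i$ is the successive sum of independent and
identically distributed (iid) Exponential random variables $E_k$ with mean one.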

The convergence in (2.4) means that, for any function $f : (0,\infty) \to (0,\infty)$ with compact
support,

$$ E \exp\left( -\sum_{i=1}^{p} f(a_{np}^{-2}\lambda_i) \right) \xrightarrow[n\to\infty]{} E \exp\left( -\sum_{i=1}^{\infty} f(b^{2/\alpha}\Gamma_i^{-2/\alpha}) \right). $$

Proof. Let $D$ denote the diagonal of $XX^T$. Proposition 3.3 of [9] shows that

$$ a_{np}^{-2} \| XX^T - D \|_2 \xrightarrow{P} 0. \tag{2.5} $$

This means that $XX^T$ can be approximated by its diagonal. Using Weyl's inequality
([5, Corollary III.2.6]), its eigenvalues are therefore asymptotically equal to its diagonal
entries. By [15, Proposition 3.21], the large deviation result (2.2) implies that the point process
of the diagonal entries of $XX^T$ converges to a Poisson point process,

$$ \sum_{i=1}^{p} \varepsilon_{a_{np}^{-2} \sum_{t=1}^{n} X_{it}^2} \xrightarrow[n\to\infty]{D} \sum_{i=1}^{\infty} \varepsilon_{b^{2/\alpha}\Gamma_i^{-2/\alpha}}. \tag{2.6} $$

Along the lines of the proof of [9, Theorem 1] it follows that this result carries over to the
eigenvalues, since, as mentioned before, they behave like the diagonal entries. □

Remark 2.2. Observe that (2.4) immediately implies the joint convergence of the $k$-largest
eigenvalues of $XX^T$ in distribution. Denote by $\lambda_{(1)} \ge \ldots \ge \lambda_{(p)} \ge 0$
the eigenvalues of $XX^T$ in decreasing order. Then we have, for any fixed integer $k$, that

$$ a_{np}^{-2} \left( \lambda_{(1)}, \ldots, \lambda_{(k)} \right) \xrightarrow[n\to\infty]{D} b^{2/\alpha} \left( \Gamma_1^{-2/\alpha}, \ldots, \Gamma_k^{-2/\alpha} \right). $$

If $b = 0$, then the normalized eigenvalues converge to zero in probability.
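
The limit in Remark 2.2 is easy to sample, which may help when checking the theory numerically. The following is a minimal sketch, not part of the original argument: the function name `sample_limit_points` and the chosen values of `alpha` and `b` are illustrative assumptions. It draws the vector $b^{2/\alpha}(\Gamma_1^{-2/\alpha}, \ldots, \Gamma_k^{-2/\alpha})$ and compares the empirical distribution of its first component with $\exp(-b x^{-\alpha/2})$, which follows from $\Gamma_1$ being standard exponential.

```python
import numpy as np

def sample_limit_points(alpha, b, k, size, rng):
    """Draw `size` copies of (b^(2/alpha) * Gamma_i^(-2/alpha)) for i = 1..k."""
    # Gamma_i = E_1 + ... + E_i with iid standard exponential E_j
    gammas = np.cumsum(rng.exponential(1.0, size=(size, k)), axis=1)
    return b ** (2.0 / alpha) * gammas ** (-2.0 / alpha)

rng = np.random.default_rng(0)
alpha, b = 1.5, 1.0  # illustrative values, not taken from the paper
points = sample_limit_points(alpha, b, k=3, size=200_000, rng=rng)

# The largest point has distribution function exp(-b * x^(-alpha/2)).
x = 1.0
empirical = np.mean(points[:, 0] <= x)
exact = np.exp(-b * x ** (-alpha / 2.0))
print(f"P(largest <= {x}): empirical {empirical:.4f}, exact {exact:.4f}")
```

Each row of `points` is decreasing, mirroring the ordering $\lambda_{(1)} \ge \ldots \ge \lambda_{(k)}$ of the normalized eigenvalues.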

The large deviation condition (2.2) is essentially equivalent to the convergence of the point
process of the partial sums of $X_t^2$ to a limiting point process. Instead of having a condition on
the partial sums, it would be much more convenient in many cases to have a condition on the
process itself. Davis and Hsing [6] give very general conditions under which the point process
convergence of the sequence $X_t^2$ gives a large deviation result for the partial sums of $X_t^2$.
This will be stated in our next theorem.



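Theorem 2.3. Assume that

$$ \sum_{t=1}^{n} \varepsilon_{a_n^{-2} X_t^2} \xrightarrow[n\to\infty]{D} \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \varepsilon_{P_i Q_{ij}}, \tag{2.7} $$

where $\sum_{i=1}^{\infty} \varepsilon_{P_i}$ is a Poisson process on $(0,\infty)$ with some intensity
measure $\nu$, and $\sum_{j=1}^{\infty} \varepsilon_{Q_{ij}}$, $i \ge 1$, is a sequence of iid point
processes on $[-1,0) \cup (0,1]$ independent of $\sum_{i=1}^{\infty} \varepsilon_{P_i}$. Further
assume that the sequence $(X_t)$ is strongly mixing. If $p, n \to \infty$ such that (2.3) is
satisfied, then we have (2.4) with

$$ b = \lim_{\delta \to 0} \int_0^{\infty} P\left( \sum_{j=1}^{\infty} u\, Q_{1j}\, 1_{(\delta,\infty)}(u |Q_{1j}|) > 1 \right) \nu(du) \in [0,\infty). \tag{2.8} $$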



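Proof. Under the above conditions, [6, Theorem 4.3] shows that

$$ \frac{P\left( \sum_{t=1}^{n} X_t^2 > a_{np}^2 x \right)}{n P(X_0^2 > a_{np}^2 x)} \xrightarrow[n\to\infty]{} \lim_{\delta \to 0} \int_0^{\infty} P\left( \sum_{j=1}^{\infty} u\, Q_{1j}\, 1_{(\delta,\infty)}(u |Q_{1j}|) > 1 \right) \nu(du) = b. $$

Since $p n P(X_0^2 > a_{np}^2 x) \to x^{-\alpha/2}$, this implies

$$ p\, P\left( \sum_{t=1}^{n} X_t^2 > a_{np}^2 x \right) \xrightarrow[n\to\infty]{} b\, x^{-\alpha/2}. $$

An application of Theorem 2.1 completes the proof. □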

Remark 2.4. The assumption that the sequence (X t ) is strongly mixing can be replaced by the 
much weaker assumption [6, (2.1)]. 

In the remainder of this article we apply the results of this section to stochastic volatility and 
GARCH processes. For a stochastic volatility process the constant b in (2.4) is essentially one, 
see (3.4). In the case of a GARCH process the characterization of b is more involved. 



3. Stochastic Volatility Models 

In this section we use the previously introduced techniques to obtain results for the
eigenvalues of sample covariance matrices when the observations are given by stochastic volatility
models. More precisely, the rows of the observation matrix $X$ are given by independent copies of
a univariate stochastic volatility process $X_t = \sigma_t Z_t$, and $\lambda_1, \ldots, \lambda_p$
denote the eigenvalues of $XX^T$.

Theorem 3.1. Assume that $(Z_t)$ is an iid sequence with regularly varying tails with index
$\alpha \in (0,2)$ and normalizing sequence $(a_n)$. This means we have that

$$ n P(|Z_0| > a_n x) \xrightarrow[n\to\infty]{} x^{-\alpha} \quad \text{for any } x > 0. \tag{3.1} $$

Moreover we assume that $Z_0$ satisfies the tail balancing condition

$$ \lim_{x\to\infty} \frac{P(Z_0 > x)}{P(|Z_0| > x)} = q \in [0,1] \quad \text{exists.} \tag{3.2} $$

Let $\sigma_t > 0$ be a stationary sequence with $E\sigma_0^{4\alpha} < \infty$ independent of
$(Z_t)$. Then $X_t = \sigma_t Z_t$ defines a stochastic volatility process. Suppose
$p = p_n \to \infty$ as $n \to \infty$ such that

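
$$ 0 < \liminf_{n\to\infty} \frac{p_n}{n^{\beta_1}} \quad \text{and} \quad \limsup_{n\to\infty} \frac{p_n}{n^{\beta_2}} < \infty $$

for some $0 < \beta_1 \le \beta_2$, where $\beta_2 < \frac{2-\alpha}{\alpha-1}$ in case
$1 < \alpha < 2$. Then we have, as $n \to \infty$, that

$$ \sum_{i=1}^{p} \varepsilon_{a_{np}^{-2}\lambda_i} \xrightarrow{D} \sum_{i=1}^{\infty} \varepsilon_{(E\sigma_0^{\alpha})^{2/\alpha}\, \Gamma_i^{-2/\alpha}}, \tag{3.3} $$

where $\Gamma_i = E_1 + \ldots + E_i$ is the successive sum of iid Exponential random variables
$E_k$ with mean one.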

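Proof. Theorem 4.2 of [14] applied to $X_t^2 = \sigma_t^2 Z_t^2$ gives that

$$ \lim_{n\to\infty} \frac{P\left( \sum_{t=1}^{n} X_t^2 > a_{np}^2 x \right)}{n P(X_0^2 > a_{np}^2 x)} = 1. $$

Since $p n P(X_0^2 > a_{np}^2 x) \to x^{-\alpha/2} E\sigma_0^{\alpha}$, this implies that

$$ p\, P\left( \sum_{t=1}^{n} X_t^2 > a_{np}^2 x \right) \xrightarrow[n\to\infty]{} x^{-\alpha/2} E\sigma_0^{\alpha}. $$

Theorem 2.1 concludes. □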



Remark 3.2. Note that the factor $(E\sigma_0^{\alpha})^{2/\alpha}$ in the limiting point process in
(3.3) appears due to the fact that $a_n$ is the normalizing sequence of $Z_1$ and not $X_1$. If
$\tilde{a}_n$ denotes the normalizing sequence of $X_1$, then we have that

$$ \sum_{i=1}^{p} \varepsilon_{\tilde{a}_{np}^{-2}\lambda_i} \xrightarrow[n\to\infty]{D} \sum_{i=1}^{\infty} \varepsilon_{\Gamma_i^{-2/\alpha}}. \tag{3.4} $$

Thus we get the same result for a stochastic volatility process as we get for an iid sequence.
The intuitive reason for this is that the dependence within the sequence $(X_t)$ is inherited from
the light-tailed sequence $(\sigma_t)$. Hence, the extremes of $(X_t)$ are asymptotically
independent.

A combination of [14, Theorem 4.6] and Theorem 2.1 can be used to obtain analogous results
for regularly varying Markov chains.

The additional assumption $0 < \liminf_{n\to\infty} p_n / n^{\beta_1}$ in Theorem 3.1 is needed to
apply Theorem 4.2 of [14]. We can eliminate this assumption at the expense of imposing mixing
properties on the volatility sequence $(\sigma_t)$. This is the content of the following theorem.

Theorem 3.3. Let $X_t = \sigma_t Z_t$ be a stochastic volatility process such that $|Z_0|$ is
regularly varying satisfying (3.1) and (3.2) with tail index $\alpha \in (0,2)$ and normalizing
sequence $(a_n)$. Assume that the volatility sequence $(\sigma_t)$ is independent of $(Z_t)$ and
either

• a stationary sequence of $m$-dependent non-negative random variables such that
  $E\sigma_0^{\alpha+\delta} < \infty$ for some $\delta > 0$, or

• the exponential of a linear process with Gaussian noise, i.e.,

  $$ \log \sigma_t = \sum_{k} \psi_k \xi_{t-k}, $$

  where $(\xi_t)$ is a sequence of iid mean-zero Gaussian random variables and $(\psi_k)$ is
  summable.

Suppose that $p, n \to \infty$ such that (2.3) is satisfied. Then we have (3.3).
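
To make the second volatility model in Theorem 3.3 concrete, the following sketch simulates an observation matrix whose rows are independent stochastic volatility paths. Everything specific here is an assumption chosen for illustration and not taken from the paper: the function name `simulate_sv_matrix`, the geometric coefficients $\psi_k = \phi^k$ (summable for $|\phi| < 1$), the noise scale `tau`, the burn-in used to approximate stationarity, and the symmetric Pareto noise with $P(|Z_0| > x) = x^{-\alpha}$ for $x \ge 1$ (so that $a_n = n^{1/\alpha}$ and $q = 1/2$).

```python
import numpy as np

def simulate_sv_matrix(p, n, alpha, phi=0.9, tau=0.3, burn_in=500, rng=None):
    """Return a p x n matrix whose rows are independent paths of X_t = sigma_t * Z_t."""
    rng = np.random.default_rng() if rng is None else rng
    total = n + burn_in
    # log sigma_t = sum_k phi^k * xi_{t-k}, computed through the equivalent
    # AR(1) recursion; the burn-in lets the zero start value fade out.
    xi = tau * rng.standard_normal((p, total))
    log_sigma = np.zeros((p, total))
    for t in range(1, total):
        log_sigma[:, t] = phi * log_sigma[:, t - 1] + xi[:, t]
    sigma = np.exp(log_sigma[:, burn_in:])
    # Symmetric Pareto noise: P(|Z| > x) = x^(-alpha) for x >= 1
    magnitude = (1.0 - rng.random((p, n))) ** (-1.0 / alpha)
    sign = rng.choice([-1.0, 1.0], size=(p, n))
    return sigma * magnitude * sign

p, n, alpha = 50, 200, 1.5
X = simulate_sv_matrix(p, n, alpha, rng=np.random.default_rng(1))
eigenvalues = np.linalg.eigvalsh(X @ X.T)      # ascending order
a_np = (n * p) ** (1.0 / alpha)                # normalizing sequence of Z for exact Pareto tails
print("largest normalized eigenvalue:", eigenvalues[-1] / a_np**2)
```

Repeating the last three lines over many independent draws gives an empirical counterpart of the largest point of the limit in (3.3), up to finite-sample effects.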

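Proof. Depending on the choice of the volatility process $(\sigma_t)$, Theorem 3.1 or Theorem 3.3
of Davis and Mikosch [8] yields that

$$ \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t} \xrightarrow[n\to\infty]{D} \sum_{i=1}^{\infty} \varepsilon_{\tilde{P}_i}, $$

where the limiting point process is Poisson with intensity measure

$$ \tilde{\nu}(dx) = \alpha E\sigma_0^{\alpha} \left[ q\, x^{-\alpha-1} 1_{(0,\infty)}(x) + (1-q) (-x)^{-\alpha-1} 1_{(-\infty,0)}(x) \right] dx. $$

An application of the continuous mapping theorem yields that

$$ \sum_{t=1}^{n} \varepsilon_{a_n^{-2} X_t^2} \xrightarrow[n\to\infty]{D} \sum_{i=1}^{\infty} \varepsilon_{\tilde{P}_i^2} = \sum_{i=1}^{\infty} \varepsilon_{P_i}. $$

The limiting point process $\sum_{i=1}^{\infty} \varepsilon_{P_i}$ of the squares has intensity
measure

$$ \nu(dx) = \frac{\alpha}{2} E\sigma_0^{\alpha}\, x^{-\alpha/2-1} 1_{(0,\infty)}(x)\, dx $$

and representation (2.7) with $Q_{ij} = 1$ if $i = j$ and zero otherwise. For either choice of
$(\sigma_t)$, this sequence is strongly mixing. Davis and Mikosch [8] have shown that $(X_t)$ is
strongly mixing if $(\sigma_t)$ is. Thus we can apply Theorem 2.3. Due to the simple structure of
$Q$, we have that

$$ b = \lim_{\delta \to 0} \int_0^{\infty} P\left( \sum_{j=1}^{\infty} u\, Q_{1j}\, 1_{(\delta,\infty)}(u |Q_{1j}|) > 1 \right) \nu(du) = \frac{\alpha}{2} E\sigma_0^{\alpha} \int_1^{\infty} u^{-\alpha/2-1}\, du = E\sigma_0^{\alpha}. $$

□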

4. GARCH processes

A GARCH(p,q) process $(X_t)$ with $p \ge 1$ or $q \ge 1$ is given by the equations

$$ X_t = \sigma_t Z_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i X_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \tag{4.1} $$

where $\alpha_i$ and $\beta_j$ are non-negative coefficients such that $\alpha_0 > 0$,
$\alpha_p > 0$ if $p \ge 1$ and $\beta_q > 0$ if $q \ge 1$, and $(Z_t)$ is an iid sequence. This is
not a stochastic volatility model since $(\sigma_t)$ is not independent of $(Z_t)$. We will make use
of the results of Section 2 to obtain the point process convergence of the eigenvalues of $XX^T$
when the rows of $X$ are given by iid copies of a GARCH process.
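
The recursion (4.1) can be sketched in code for the simplest case $p = q = 1$. The function name `simulate_garch11`, the Gaussian innovations, the start value and the burn-in length are assumptions made for this illustration only; the burn-in is a common device to reduce the influence of the arbitrary start value and does not produce an exactly stationary path.

```python
import numpy as np

def simulate_garch11(n, alpha0, alpha1, beta1, burn_in=1000, rng=None):
    """Simulate n observations of X_t = sigma_t * Z_t with
    sigma_t^2 = alpha0 + alpha1 * X_{t-1}^2 + beta1 * sigma_{t-1}^2."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(n + burn_in)   # iid innovations Z_t (assumed Gaussian here)
    x = np.empty(n + burn_in)
    sigma2 = alpha0                        # arbitrary positive start value
    for t in range(n + burn_in):
        x[t] = np.sqrt(sigma2) * z[t]
        sigma2 = alpha0 + alpha1 * x[t] ** 2 + beta1 * sigma2
    return x[burn_in:]

# Observation matrix with iid GARCH(1,1) rows (illustrative parameter values)
rng = np.random.default_rng(2)
rows, n_obs = 20, 200
X = np.vstack([simulate_garch11(n_obs, 0.1, 0.6, 0.3, rng=rng) for _ in range(rows)])
print(X.shape)
```

Because $\sigma_t^2$ depends on the past innovations through $X_{t-1}^2$, the volatility is not independent of $(Z_t)$, which is the distinction from Section 3 made in the text.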

Theorem 4.1. Suppose that $(Z_t)$ is an iid sequence satisfying the following conditions:

• $Z_1$ is symmetric, i.e., $Z_1 \overset{D}{=} -Z_1$, and has a strictly positive density on
  $\mathbb{R}$.

• There exists an $h_0 \in (0,\infty]$ such that $E|Z_1|^{h} < \infty$ for $h < h_0$ and
  $E|Z_1|^{h_0} = \infty$.

Assume there exists a unique strictly stationary solution $(X_t)$ to equation (4.1). Then $X_1$ is
regularly varying with index $\alpha > 0$ and some normalizing sequence $(a_n)$. Suppose that
$\alpha < 2$. Then we have, as $p, n \to \infty$ such that (2.3) is satisfied, that the convergence
in (2.4) holds with $b$ as given in (2.8).

The intensity measure $\nu$ in (2.8) is given by $\nu(x,\infty) = \gamma x^{-\alpha}$ for some
$\gamma \in (0,1]$, and $(Q_{ij})$ satisfies $\sup_j |Q_{ij}| = 1$ for all $i \ge 1$. The
distribution of $(Q_{ij})$ is described in detail in Theorem 2.8 of Davis and Mikosch [7].

Proof. By Basrak, Davis and Mikosch [4, Corollary 3.5] we obtain that $(X_t)$ is strongly mixing
and has regularly varying tail probabilities with index $\alpha > 0$. Since we assume that
$\alpha < 2$, we can apply Theorem 2.3. In [4, Theorem 2.10] it is shown, using the results from
[7], that the point process $\sum_{t=1}^{n} \varepsilon_{a_n^{-2} X_t^2}$ has a limiting point
process with representation (2.7). For details, see also the proof of [4, Theorem 3.6]. □

Remark 4.2. (i) Note that $\gamma \in (0,1]$ is the extremal index of the sequence $(|X_t|)$.

(ii) In the case when $(X_t)$ is a GARCH(1,1) process, the theorem simplifies considerably. If
$\alpha_0, \alpha_1, \beta_1 > 0$, then $(X_t)$ has a unique strictly stationary solution if and
only if

$$ -\infty \le E \log(\alpha_1 Z_1^2 + \beta_1) < 0, $$

cf. [1, p. 47]. Let us further assume that $EZ_1 = 0$ and $EZ_1^2 = 1$. In this case, the index
$\alpha$ of regular variation of $X_1$ is given by the unique solution to the equation

$$ E\left[ (\alpha_1 Z_1^2 + \beta_1)^{\alpha/2} \right] = 1. $$
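
The tail index equation of Remark 4.2 (ii) has no closed-form solution in general, but it can be solved numerically. The sketch below assumes standard Gaussian innovations and evaluates the expectation by Gauss-Hermite quadrature, followed by bisection; the function name `garch11_tail_index`, the number of quadrature nodes and the tolerance are illustrative choices, not part of the paper. It uses that $\kappa \mapsto E[(\alpha_1 Z_1^2 + \beta_1)^{\kappa}] - 1$ is convex, vanishes at $\kappa = 0$ and is negative just to the right of zero when the stationarity condition holds.

```python
import numpy as np

def garch11_tail_index(alpha1, beta1, n_nodes=100, tol=1e-12):
    """Solve E[(alpha1 * Z^2 + beta1)^(a/2)] = 1 for a > 0, with Z ~ N(0, 1) assumed."""
    # Gauss-Hermite nodes and weights for the weight function exp(-z^2 / 2)
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)           # normalise: the weights now sum to one
    log_a = np.log(alpha1 * z**2 + beta1)
    if np.sum(w * log_a) >= 0:
        raise ValueError("stationarity condition E log(alpha1 Z^2 + beta1) < 0 fails")

    def h(kappa):
        # h(kappa) = E[(alpha1 Z^2 + beta1)^kappa] - 1
        return np.sum(w * np.exp(kappa * log_a)) - 1.0

    hi = 1.0
    while h(hi) <= 0:                      # enlarge the bracket until h changes sign
        hi *= 2.0
        if hi > 1e3:
            raise ValueError("no root found; all moments may be finite")
    lo = 0.0
    while hi - lo > tol:                   # bisection: h < 0 left of the root, h > 0 right of it
        mid = 0.5 * (lo + hi)
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo + hi                         # a = 2 * kappa with kappa = (lo + hi) / 2

alpha_igarch = garch11_tail_index(0.1, 0.9)   # alpha1 + beta1 = 1
alpha_heavy = garch11_tail_index(1.0, 0.3)    # alpha1 + beta1 > 1
print(alpha_igarch, alpha_heavy)
```

For $\alpha_1 + \beta_1 = 1$ the equation is solved by $\alpha = 2$, since $EZ_1^2 = 1$; values with $\alpha_1 + \beta_1 > 1$ that still satisfy the stationarity condition give $\alpha < 2$, which is the regime required in Theorem 4.1.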

To finish this section we want to mention that, using the results from [4] and [7], the same sort 
of argument could be applied to the class of solutions of stochastic recurrence equations. 

5. The largest eigenvalue in the case when $p > n$

In the previous sections we assumed that (2.3) is satisfied for some $\beta > 0$ with
$\beta < \frac{2-\alpha}{\alpha-1}$ in the case $1 < \alpha < 2$. This implies for
$\alpha \in (1.5, 2)$ that $p$ grows slower than $n$. In statistical genetics, however, one is often
confronted with the case that the dimension of the data $p$ (e.g., the number of genes) is much
larger than the sample size $n$ (e.g., the number of patients). This may also happen in finance
when one considers the problem of optimizing a large portfolio. Therefore we present a result in
the upcoming theorem for the largest eigenvalue of $XX^T$ that holds for $p > n$ independently of
the value of $\alpha$ as long as $0 < \alpha < 2$. It shows that the largest eigenvalue
$\lambda_{\max}$ of $XX^T$ is asymptotically equal to the maximum of the squares of the entries of
$X$.

Theorem 5.1. Let $(X_t)$ be a strictly stationary stochastic process which satisfies (2.1) with
normalizing sequence $(a_n)$ and tail index $\alpha \in (0,2)$. Assume that $(|X_t|)$ has extremal
index equal to one. Further let $p = p_n = n^{\kappa}$ for some $\kappa > 1$. Suppose, for any
$x > 0$, that

$$ \lim_{n\to\infty} \frac{P\left( \sum_{t=1}^{n} |X_t| > a_{np} x \right)}{n P(|X_0| > a_{np} x)} = 1, $$

and

$$ \lim_{n\to\infty} \frac{P\left( \sum_{t=1}^{n} X_t^2 > a_{np}^2 x \right)}{n P(X_0^2 > a_{np}^2 x)} = 1. $$

Assume the rows of $X$ are given by independent copies of $(X_t)$, and denote by $\lambda_{\max}$
the largest eigenvalue of $XX^T$. Then we have

$$ \frac{\lambda_{\max}}{\max_{1 \le i \le p,\, 1 \le t \le n} X_{it}^2} \xrightarrow[n\to\infty]{P} 1. \tag{5.1} $$

In particular, this implies that

$$ \lim_{n\to\infty} P(\lambda_{\max} \le a_{np}^2 x) = \exp(-x^{-\alpha/2}). $$
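
A small numerical experiment can illustrate (5.1). The sketch below uses iid symmetric Pareto entries, which form the simplest process with extremal index one; the sizes `n` and `p`, the tail index and the seed are illustrative assumptions, and a single finite sample only hints at the asymptotic statement.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, alpha = 40, 400, 1.0   # p > n and tail index alpha in (0, 2); illustrative values

# iid symmetric Pareto entries: P(|X| > x) = x^(-alpha) for x >= 1
magnitude = (1.0 - rng.random((p, n))) ** (-1.0 / alpha)
X = magnitude * rng.choice([-1.0, 1.0], size=(p, n))

# The non-zero eigenvalues of X X^T (p x p) and X^T X (n x n) coincide,
# so the smaller n x n matrix suffices for the largest eigenvalue.
lambda_max = np.linalg.eigvalsh(X.T @ X)[-1]
max_sq_entry = np.max(X ** 2)
ratio = lambda_max / max_sq_entry
print(f"lambda_max / max squared entry = {ratio:.4f}")
```

The ratio is never below one, because the largest eigenvalue dominates every diagonal entry of $XX^T$ and hence every squared entry of $X$; Theorem 5.1 states that under its assumptions the ratio also approaches one from above.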



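Proof. Since $(|X_t|)$ has extremal index equal to one, and
$\max_{1 \le t \le n} |X_t| \le \sum_{t=1}^{n} |X_t|$, we have, as $n \to \infty$, that

$$
\begin{aligned}
1 - o(1) &\le \frac{P\left( \max_{1 \le t \le n} |X_t| > a_{np} \max\{x,y\} \right)}{n P(|X_0| > a_{np} \max\{x,y\})} \\
&\le \frac{P\left( \sum_{t=1}^{n} |X_t| > a_{np} x,\ \max_{1 \le t \le n} |X_t| > a_{np} y \right)}{n P(|X_0| > a_{np} \max\{x,y\})}
\le \frac{P\left( \sum_{t=1}^{n} |X_t| > a_{np} \max\{x,y\} \right)}{n P(|X_0| > a_{np} \max\{x,y\})}.
\end{aligned}
$$

The latter converges to one by assumption, thus

$$ \frac{p\, P\left( \sum_{t=1}^{n} |X_t| > a_{np} x,\ \max_{1 \le t \le n} |X_t| > a_{np} y \right)}{n p\, P(|X_0| > a_{np} \max\{x,y\})} \xrightarrow[n\to\infty]{} 1. $$

Hence

$$ \sum_{i=1}^{p} \varepsilon_{a_{np}^{-1} \left( \sum_{t=1}^{n} |X_{it}|,\ \max_{1 \le t \le n} |X_{it}| \right)} \xrightarrow[n\to\infty]{D} \sum_{i=1}^{\infty} \varepsilon_{\Gamma_i^{-1/\alpha} (1,1)}. $$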
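
This yields

$$ \frac{\max_{1 \le i \le p} \sum_{t=1}^{n} |X_{it}|}{\max_{1 \le i \le p,\, 1 \le t \le n} |X_{it}|} \xrightarrow[n\to\infty]{P} 1. \tag{5.2} $$

Likewise, we obtain for the squares

$$ \frac{\max_{1 \le i \le p} \sum_{t=1}^{n} X_{it}^2}{\max_{1 \le i \le p,\, 1 \le t \le n} X_{it}^2} \xrightarrow[n\to\infty]{P} 1. \tag{5.3} $$

Now we proceed like in the proof of [9, Theorem 2]. Since
$\max_{1 \le i \le p} \sum_{t=1}^{n} X_{it}^2 \le \lambda_{\max} \le \|X\|_{\infty}^2$, we obtain by
(5.2) and (5.3) that

$$ \frac{\lambda_{\max}}{\max_{1 \le i \le p} \sum_{t=1}^{n} X_{it}^2} \xrightarrow[n\to\infty]{P} 1. $$

Combined with (5.3), this gives (5.1). □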

The above theorem applies to stochastic volatility models, as we show in the upcoming corollary.
It should not be mistaken for a special case of Theorem 3.1 since the growth condition on $p_n$
is different.

Corollary 5.2. Let $X_t = \sigma_t Z_t$ such that $(Z_t)$ is an iid sequence with regularly varying
tails satisfying (3.1) and (3.2) with index $0 < \alpha < 2$ and normalizing sequence $(a_n)$.
Assume that $\sigma_t > 0$ is a stationary sequence independent of $(Z_t)$ that is strongly mixing
with rate function $r_j = O(j^{-a})$ with
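$a > 1$. Further assume that $E\sigma_0^{\alpha+\delta} < \infty$ for some $\delta > 0$. Let
$n \to \infty$ and suppose that $p = p_n = n^{\kappa}$ for some $\kappa > 1$. Then we have (5.1).
This implies that

$$ \lim_{n\to\infty} P\left( \frac{\lambda_{\max}}{a_{np}^2 (E\sigma_0^{\alpha})^{2/\alpha}} \le x \right) = \exp(-x^{-\alpha/2}). $$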



Proof. Since the square of a stochastic volatility process is again a stochastic volatility process,
it suffices to show that

$$ \lim_{n\to\infty} \frac{P\left( \sum_{t=1}^{n} |X_t| > a_{np} x \right)}{n P(|X_0| > a_{np} x)} = 1. $$

For $0 < \alpha < 1$ this has been done in [14, Theorem 4.2] (note that
$a_{np} = n^{(\kappa+1)/\alpha}$ since $p = n^{\kappa}$). For $1 < \alpha < 2$ this has been shown
for the centered sum. However, since $\kappa > 1$ and $\alpha < 2$, we have that
$n E|X_0| / a_{np}$ converges to zero, therefore a centering is not needed. □